General Outline
This pretty much covers the technical outline; the next steps would be building the concrete models to perform the Machine Learning and Recommendation System tasks for specific UX components. To recap, my goal is to build an easy-to-use plug-n-play Web Service that enables user-behavior-driven content discovery for small and medium-sized businesses. If you are in a hurry, check the pictures and skim through; the bullet points are for details.

Technical Plug-n-Play Outline
I call my platform OBHave! as in "Oh, Behave!", since after all we are working with human behavior here. The idea of OBHave! is to work as a middle-end that is able to analyze user behavior and manipulate the chosen content items fetched from a REST API. To make the Machine Learn, the front-end team needs to add events to the UI interactions that are related to content item navigation, or use ready-made OBHave! components. Events are in a standardized format (userID, contentItemID, optional: startTime, endTime, etc.; the class is extendable). The EventClass has edges to users by id and to content items by id. The target of the content item edge can be a routable item or a non-routed item. Non-routed items are good for benchmarking OBHave! before actually using it; this way the Machine can Learn without affecting the existing software, and the customer can start using OBHave! after it has proven its capability to improve the UX by shortening the clickstream and spotting UX issues.

The picture below shows a simple flowchart of how a user event with routed content items works:

In detail, what goes on is:
* The UX component requests data with a REST call that is routed to the matching OBHave! event class REST end-point provided by OrientDB.
* OBHave! matches the user id and checks if there are ready recommendations for that specific content item type. Usually these content items are super-containers for sub-content items.
** In case we have a recommendation, the ids of the predefined content items of a recommendation set will be added to the request parameters of the predefined content item REST API route, if the "cache" has expired.
** In case we don't have a recommendation, the request is forwarded as it is predefined.
** The original GET query parameters will be added as they were for both REST API calls.
* The back-end will process the request as it always has; no changes are required unless the REST API can't process lists of resource ids in any standard way, in which case such a feature has to be implemented.
* The result set will be stored in OBHave! for further analysis, and existing resources will be updated.
* The result set is returned to the UI as it would have been without OBHave!, though now the resources were chosen by the ids generated by the recommendation system, which used the user id to infer suitable content.

Technical Overview and Possibilities
I will use OrientDB, a multi-model graph-and-document database, to prototype OBHave! (it has a built-in REST server and an object-oriented structure that can be enhanced with event-based JavaScript, besides all the traditional database features). Version 0.1 will need identification of unique users, configuration for the routes to the back-end and general OBHave! core features for the first UX use-case(s). Depending on the needs of the pilot customers and their willingness to participate financially, we can build:

Some details:
* React.js and Flux-based UI/UX components that are easy to plug in to any website (later we could create proper templating and skin features with Node.js)
* jQuery / plain JavaScript / some other JS HTTP wrapper for safe trial usage. When testing new technologies, reliability and proving your worth are important for bigger players. Thus we could build a jQuery HTTP wrapper that would route only 1%, 3%, 5%, 10%, 20%, 50% or 100% of the traffic to OBHave! for populating it with data.
* Proving that the Machine Learning works could be achieved similarly: when we have populated OBHave! and it has run some diagnostics, we can attach JavaScript events that target the resource clicks and prove that our recommendation system knows what the users are trying to find on the website (the recommendations match the real content the user interacted with).
* Some more random ideas include:
** Front-end cache with OrientDB (later we can combine it with Redis, Varnish, etc.)
** ACL with OrientDB that will learn user roles through REST (403 Forbidden) or can be taught with other standard procedures of ACL management, so that there will be less traffic to the back-end due to ACL issues
** User session management, so that there will be less traffic to the back-end, e.g. a Web-component-based Facebook login that is easy to integrate into any Web app.

Machine Learning (input, reinforcement learning, unsupervised learning)
Because computational power is constantly scaling up, neural networks are becoming much more successful than the traditional methods of Machine Learning with Recommendation Systems. My model is based on the simplistic assumption that there are two things that keep ideas in your consciousness: stress and rewards. When an idea is in your consciousness, you learn about it when your subconsciousness tries to combine the concept with your sensory data (usually the case when you are doing something rewarding) or through free association / history / experiences (usually the case when you are stressed). When your subconsciousness manages to connect a workable solution to the idea troubling your consciousness, you have learned something.
Based on this philosophy I have divided the user behavior into two clusters.

The first I have labeled the conscious cluster (inputs, reinforcement learning):
* The conscious cluster tracks all the conscious navigation efforts (input) the user has to make in order to find content to interact with; navigation should cause stress to the Machine, because there has obviously been a problem with its Learning.
* The conscious cluster also tracks the interactions between the content and the users (reinforcement learning rewards). When the Machine was able to serve the user something meaningful and it performed better than before, it shall be rewarded, since it has Learned something useful about the user.
* New theories are tested by giving them "a boost" in the recommendation system; if the theory was present and the Machine felt stress, there is a problem with the theory; if the theory was rewarding instead, it should be learned into the behavior core of the Machine.

The second I have labeled the subconscious cluster (unsupervised learning):
* It holds the Machine's current judgment about the users and the content items, based on the learned theories.
* A user can belong to multiple user groups, and user groups are connected to content items with specific weights based on the group's cohesion etc.
* Interaction groups are groups of events (and their parameters) that combine certain users and content items; these are the stress signals that attempt to create better links between the users and the content items, links that would require fewer interactions.
* Other groups represent different theories and connect with users and interactions; this is the free association layer where interactions cause stress. Theories are always connected to a "solution", which contains all the items that have caused stress related to the interactions.
Those items will be recommended to the users by using the "solution", and if the stress gets relieved, the theory will gain power and might be learned.

The Recommendation System (output, reinforcement learning)
Based on the user groups, interactions and theories, the Machine creates new recommendation sets. Each recommendation set has more content items than necessary, but it will be personalized and prioritized by the user's cohesion with the groups the recommendation was based upon. When the Machine's and the user's subconsciousness are in sync, there is no need for learning, but as the user starts to navigate more, the Machine starts to stress and attempts to learn more about the user. Recommendation sets are always related to specific (sub-)content item (super-)containers, which can be considered (sub-)content items themselves and be contained by other (super-)containers (Flux architecture). At times recommendation set containers could require human assistance, like giving the container a proper label, in order to guarantee a better user experience. In such situations the human will be provided information about how it relates to the existing ones (a combination of folk metal and hardcore metal might be hardcore folk metal).

To conclude all this, OBHave! will generate predefined content item id lists for container UI objects. Each list has more items than the container can accommodate; the specific items chosen differ by user, by time and by other variables. Which list to retrieve for a specific user depends upon the user groups, interaction groups and theory / solution groups.
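
To make the last step a bit more concrete, here is a minimal Python sketch of how one oversized id list could be trimmed and ordered per user. It is only an illustration under my own assumptions: the names `RecommendationSet` and `pick_items`, the shape of the weights, and the scoring rule (the user's cohesion with a group multiplied by that group's weight for the item, summed over groups) are hypothetical and not part of any existing OBHave! code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RecommendationSet:
    """Candidate content item ids for one container (hypothetical shape)."""
    container_id: str
    # content item id -> {group id: weight linking that group to the item}
    item_group_weights: Dict[str, Dict[str, float]] = field(default_factory=dict)


def pick_items(rec_set: RecommendationSet,
               user_cohesion: Dict[str, float],
               capacity: int) -> List[str]:
    """Return at most `capacity` item ids, best match for this user first.

    `user_cohesion` maps group id -> how strongly the user belongs to it.
    The score is an assumed rule: sum of cohesion * group-to-item weight.
    """
    def score(item_id: str) -> float:
        weights = rec_set.item_group_weights[item_id]
        return sum(user_cohesion.get(group, 0.0) * w
                   for group, w in weights.items())

    # Highest score first; ties are broken by item id to keep the order stable.
    ranked = sorted(rec_set.item_group_weights,
                    key=lambda item_id: (-score(item_id), item_id))
    return ranked[:capacity]


# The list holds four items, but the container only has room for two.
front_page = RecommendationSet("front-page", {
    "a": {"metal": 0.9},
    "b": {"folk": 0.8},
    "c": {"metal": 0.4, "folk": 0.5},
    "d": {"jazz": 1.0},
})

metal_fan = {"metal": 1.0, "folk": 0.2}
folk_fan = {"folk": 1.0}

print(pick_items(front_page, metal_fan, 2))
print(pick_items(front_page, folk_fan, 2))
```

The same list yields different items for the two users, which is the point: the list itself is shared and predefined, and only the final cut depends on the user's group memberships. Time and the other variables mentioned above could enter the score in the same way, but that is left out here.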